MCRNR: Fast Computing of Restricted Nash Responses by Means of Sampling

نویسندگان

  • Marc J. V. Ponsen
  • Marc Lanctot
  • Steven de Jong
چکیده

This paper presents a sample-based algorithm for the computation of restricted Nash strategies in complex extensive form games. Recent work indicates that regret-minimization algorithms using selective sampling, such as Monte-Carlo Counterfactual Regret Minimization (MCCFR), converge faster to Nash equilibrium (NE) strategies than their non-sampled counterparts which perform a full tree traversal. In this paper, we show that MCCFR is also able to establish NE strategies in the complex domain of Poker. Although such strategies are defensive (i.e. safe to play), they are oblivious to opponent mistakes. We can thus achieve better performance by using (an estimation of) opponent strategies. The Restricted Nash Response (RNR) algorithm was proposed to learn robust counter-strategies given such knowledge. It solves a modified game, wherein it is assumed that opponents play according to a fixed strategy with a certain probability, or to a regret-minimizing strategy otherwise. We improve the rate of convergence of the RNR algorithm using sampling. Our new algorithm, MCRNR, samples only relevant parts of the game tree. It is therefore able to converge faster to robust bestresponse strategies than RNR. We evaluate our algorithm on a variety of imperfect information games that are small enough to solve yet large enough to be strategically interesting, as well as a large game, Texas Hold’em Poker.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Computing Approximate Nash Equilibria and Robust Best-Responses Using Sampling

This article discusses two contributions to decision-making in complex partially observable stochastic games. First, we apply two state-of-the-art search techniques that use Monte-Carlo sampling to the task of approximating a Nash-Equilibrium (NE) in such games, namely Monte-Carlo Tree Search (MCTS) and Monte-Carlo Counterfactual Regret Minimization (MCCFR). MCTS has been proven to approximate ...

متن کامل

Choosing Samples to Compute Heuristic-Strategy Nash Equilibrium

Auctions define games of incomplete information for which it is often too hard to compute the exact Bayesian-Nash equilibrium. Instead, the infinite strategy space is often populated with heuristic strategies, such as myopic best-response to prices. Given these heuristic strategies, it can be useful to evaluate the strategies and the auction design by computing a Nash equilibrium across the res...

متن کامل

An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information

Developing scalable solution algorithms is one of the central problems in computational game theory. We present an iterative algorithm for computing an exact Nash equilibrium for two-player zero-sum extensive-form games with imperfect information. Our approach combines two key elements: (1) the compact sequence-form representation of extensiveform games and (2) the algorithmic framework of doub...

متن کامل

Double-oracle algorithm for computing an exact nash equilibrium in zero-sum extensive-form games

We investigate an iterative algorithm for computing an exact Nash equilibrium in two-player zero-sum extensive-form games with imperfect information. The approach uses the sequence-form representation of extensive-form games and the double-oracle algorithmic framework. The main idea is to restrict the game by allowing the players to play only some of the sequences of available actions, then ite...

متن کامل

Using and comparing metaheuristic algorithms for optimizing bidding strategy viewpoint of profit maximization of generators

With the formation of the competitive electricity markets in the world, optimization of bidding strategies has become one of the main discussions in studies related to market designing. Market design is challenged by multiple objectives that need to be satisfied. The solution of those multi-objective problems is searched often over the combined strategy space, and thus requires the simultaneous...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010